Overview

Dataset statistics

Number of variables: 25
Number of observations: 65229
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 12.4 MiB
Average record size in memory: 200.0 B
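
The two memory figures agree with each other: the average record size multiplied by the number of observations gives roughly the reported total. A minimal sketch of that arithmetic, using only the numbers listed above (the variable names are illustrative, not part of the report):

```python
# Figures copied from the dataset statistics above.
n_observations = 65229
avg_record_size_bytes = 200.0

# Total size = records x average record size, converted from bytes to MiB.
total_mib = n_observations * avg_record_size_bytes / 2**20
print(f"{total_mib:.1f} MiB")
```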

Variable types

Numeric: 11
Categorical: 14

Alerts

country has a high cardinality: 155 distinct values [High cardinality]
stays_in_weekend_nights is highly correlated with total_nights [High correlation]
stays_in_week_nights is highly correlated with total_nights [High correlation]
is_repeated_guest is highly correlated with previous_bookings_not_canceled [High correlation]
previous_bookings_not_canceled is highly correlated with is_repeated_guest [High correlation]
total_nights is highly correlated with stays_in_weekend_nights and 1 other field [High correlation]
id is highly correlated with is_canceled and 4 other fields [High correlation]
is_canceled is highly correlated with id [High correlation]
lead_time is highly correlated with id [High correlation]
arrival_date_year is highly correlated with id and 3 other fields [High correlation]
arrival_date_month is highly correlated with id and 2 other fields [High correlation]
arrival_date_week_number is highly correlated with id and 2 other fields [High correlation]
children is highly correlated with reserved_room_type [High correlation]
previous_cancellations is highly correlated with previous_bookings_not_canceled [High correlation]
previous_bookings_not_canceled is highly correlated with previous_cancellations [High correlation]
reserved_room_type is highly correlated with children [High correlation]
customer_type is highly correlated with arrival_date_year [High correlation]
previous_cancellations is highly skewed (γ1 = 20.72991635) [Skewed]
previous_bookings_not_canceled is highly skewed (γ1 = 24.30889795) [Skewed]
id has unique values [Unique]
lead_time has 3423 (5.2%) zeros [Zeros]
stays_in_week_nights has 4007 (6.1%) zeros [Zeros]
previous_cancellations has 59591 (91.4%) zeros [Zeros]
previous_bookings_not_canceled has 63686 (97.6%) zeros [Zeros]
booking_changes has 56195 (86.2%) zeros [Zeros]
days_in_waiting_list has 62005 (95.1%) zeros [Zeros]
total_of_special_requests has 40671 (62.4%) zeros [Zeros]
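
The percentages in the Zeros alerts are simply each zero count divided by the 65229 observations. The sketch below recomputes them from the counts listed above; the helper name `zero_share` is an illustrative assumption and not something the report exposes.

```python
N_OBSERVATIONS = 65229  # from the dataset statistics

# Zero counts copied from the Zeros alerts above.
zero_counts = {
    "lead_time": 3423,
    "stays_in_week_nights": 4007,
    "previous_cancellations": 59591,
    "previous_bookings_not_canceled": 63686,
    "booking_changes": 56195,
    "days_in_waiting_list": 62005,
    "total_of_special_requests": 40671,
}

def zero_share(count, total=N_OBSERVATIONS):
    """Return the percentage of zero-valued rows, rounded to one decimal."""
    return round(100.0 * count / total, 1)

for name, count in zero_counts.items():
    print(f"{name}: {zero_share(count)}%")
```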

Reproduction

Analysis started: 2022-07-06 05:02:23.930007
Analysis finished: 2022-07-06 05:02:59.929846
Duration: 36 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
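
The reported duration follows from the two timestamps. A small sketch using Python's standard `datetime` module shows the subtraction; the format string and variable names are assumptions made for this illustration.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2022-07-06 05:02:23.930007", FMT)
finished = datetime.strptime("2022-07-06 05:02:59.929846", FMT)

# Elapsed wall-clock time in seconds, then rounded as in the report.
elapsed = (finished - started).total_seconds()
print(f"{elapsed:.6f} s, about {round(elapsed)} s")
```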

Variables

id
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 65229
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 43544.06917
Minimum: 0
Maximum: 84121
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 509.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3293.4
Q1: 19108
median: 40554
Q3: 67414
95-th percentile: 80659.6
Maximum: 84121
Range: 84121
Interquartile range (IQR): 48306

Descriptive statistics

Standard deviation: 25614.85897
Coefficient of variation (CV): 0.5882513844
Kurtosis: -1.329710141
Mean: 43544.06917
Median Absolute Deviation (MAD): 24186
Skewness: -0.0616717062
Sum: 2840336088
Variance: 656121000.1
Monotonicity: Strictly increasing
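
Several of the figures for id are derived from others: the range is maximum minus minimum, the IQR is Q3 minus Q1, and the coefficient of variation is the standard deviation divided by the mean. The sketch below recomputes them from the reported values and adds a hypothetical helper, `is_strictly_increasing`, showing what the monotonicity label means; it is an illustration, not the report's own code.

```python
# Figures copied from the id statistics above.
minimum, maximum = 0, 84121
q1, q3 = 19108, 67414
mean, std = 43544.06917, 25614.85897

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation

def is_strictly_increasing(values):
    """True when every value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))

print(value_range, iqr, round(cv, 4))
print(is_strictly_increasing([0, 1, 2, 5]), is_strictly_increasing([0, 2, 2]))
```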
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
0                     1      < 0.1%
28007                 1      < 0.1%
83266                 1      < 0.1%
70984                 1      < 0.1%
73033                 1      < 0.1%
66890                 1      < 0.1%
68939                 1      < 0.1%
79180                 1      < 0.1%
81229                 1      < 0.1%
75086                 1      < 0.1%
Other values (65219)  65219  > 99.9%

Value  Count  Frequency (%)
0      1      < 0.1%
1      1      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
7      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%

Value  Count  Frequency (%)
84121  1      < 0.1%
84117  1      < 0.1%
84094  1      < 0.1%
84063  1      < 0.1%
84057  1      < 0.1%
84056  1      < 0.1%
84050  1      < 0.1%
84023  1      < 0.1%
84022  1      < 0.1%
84016  1      < 0.1%

is_canceled
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 509.7 KiB
0: 41185
1: 24044

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters: 65229
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
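
As a rough illustration of the per-category character counts reported for the categorical variables, the sketch below uses Python's standard `unicodedata` module. It is not the report generator's own code: the helper `category_counts`, the name mapping, and the sample strings are assumptions made for this example, and only general categories are covered because the standard library does not expose scripts or blocks.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

print(category_counts(["0", "1", "0"]))
print(category_counts(["2.0", "TA/TO"]))
```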

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row1

Common Values

Value  Count  Frequency (%)
0      41185  63.1%
1      24044  36.9%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
041185
63.1%
124044
36.9%

Most occurring characters

ValueCountFrequency (%)
041185
63.1%
124044
36.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number65229
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
041185
63.1%
124044
36.9%

Most occurring scripts

ValueCountFrequency (%)
Common65229
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
041185
63.1%
124044
36.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII65229
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
041185
63.1%
124044
36.9%

lead_time
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct369
Distinct (%)0.6%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean96.33791718
Minimum0
Maximum374
Zeros3423
Zeros (%)5.2%
Negative0
Negative (%)0.0%
Memory size509.7 KiB

Quantile statistics

Minimum0
5-th percentile0
Q117
median64
Q3151
95-th percentile301
Maximum374
Range374
Interquartile range (IQR)134

Descriptive statistics

Standard deviation96.12754464
Coefficient of variation (CV)0.9978163059
Kurtosis0.040622905
Mean96.33791718
Median Absolute Deviation (MAD)55
Skewness1.036433896
Sum6284026
Variance9240.504839
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0                   3423   5.2%
1                   1984   3.0%
2                   1156   1.8%
3                   1051   1.6%
4                   994    1.5%
5                   944    1.4%
6                   820    1.3%
7                   706    1.1%
12                  664    1.0%
8                   663    1.0%
Other values (359)  52824  81.0%

Value  Count  Frequency (%)
0      3423   5.2%
1      1984   3.0%
2      1156   1.8%
3      1051   1.6%
4      994    1.5%
5      944    1.4%
6      820    1.3%
7      706    1.1%
8      663    1.0%
9      574    0.9%

Value  Count  Frequency (%)
374    20     < 0.1%
373    1      < 0.1%
372    33     0.1%
367    20     < 0.1%
365    61     0.1%
364    123    0.2%
363    27     < 0.1%
362    1      < 0.1%
361    6      < 0.1%
360    24     < 0.1%

arrival_date_year
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size509.7 KiB
2016
46901 
2015
18328 

Length

Max length4
Median length4
Mean length4
Min length4

Characters and Unicode

Total characters: 260916
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row2015
2nd row2015
3rd row2015
4th row2015
5th row2015

Common Values

ValueCountFrequency (%)
201646901
71.9%
201518328
 
28.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
201646901
71.9%
201518328
 
28.1%

Most occurring characters

ValueCountFrequency (%)
265229
25.0%
065229
25.0%
165229
25.0%
646901
18.0%
518328
 
7.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number260916
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
265229
25.0%
065229
25.0%
165229
25.0%
646901
18.0%
518328
 
7.0%

Most occurring scripts

ValueCountFrequency (%)
Common260916
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
265229
25.0%
065229
25.0%
165229
25.0%
646901
18.0%
518328
 
7.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII260916
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
265229
25.0%
065229
25.0%
165229
25.0%
646901
18.0%
518328
 
7.0%

arrival_date_month
Categorical

HIGH CORRELATION

Distinct12
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size509.7 KiB
October
9255 
September
8782 
August
6678 
November
5761 
July
5718 
Other values (7)
29035 

Length

Max length9
Median length7
Mean length6.352419936
Min length3

Characters and Unicode

Total characters: 414362
Distinct characters: 26
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowJuly
2nd rowJuly
3rd rowJuly
4th rowJuly
5th rowJuly

Common Values

ValueCountFrequency (%)
October9255
14.2%
September8782
13.5%
August6678
10.2%
November5761
8.8%
July5718
8.8%
December5497
8.4%
April4804
7.4%
May4677
7.2%
June4659
7.1%
March4148
6.4%
Other values (2)5250
8.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
october9255
14.2%
september8782
13.5%
august6678
10.2%
november5761
8.8%
july5718
8.8%
december5497
8.4%
april4804
7.4%
may4677
7.2%
june4659
7.1%
march4148
6.4%
Other values (2)5250
8.0%

Most occurring characters

ValueCountFrequency (%)
e71631
17.3%
r46855
 
11.3%
b32653
 
7.9%
u28983
 
7.0%
t24715
 
6.0%
m20040
 
4.8%
c18900
 
4.6%
a15967
 
3.9%
y15645
 
3.8%
o15016
 
3.6%
Other values (16)123957
29.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter349133
84.3%
Uppercase Letter65229
 
15.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e71631
20.5%
r46855
13.4%
b32653
9.4%
u28983
8.3%
t24715
 
7.1%
m20040
 
5.7%
c18900
 
5.4%
a15967
 
4.6%
y15645
 
4.5%
o15016
 
4.3%
Other values (8)58728
16.8%
Uppercase Letter
ValueCountFrequency (%)
J12269
18.8%
A11482
17.6%
O9255
14.2%
M8825
13.5%
S8782
13.5%
N5761
8.8%
D5497
8.4%
F3358
 
5.1%

Most occurring scripts

ValueCountFrequency (%)
Latin414362
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e71631
17.3%
r46855
 
11.3%
b32653
 
7.9%
u28983
 
7.0%
t24715
 
6.0%
m20040
 
4.8%
c18900
 
4.6%
a15967
 
3.9%
y15645
 
3.8%
o15016
 
3.6%
Other values (16)123957
29.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII414362
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e71631
17.3%
r46855
 
11.3%
b32653
 
7.9%
u28983
 
7.0%
t24715
 
6.0%
m20040
 
4.8%
c18900
 
4.6%
a15967
 
3.9%
y15645
 
3.8%
o15016
 
3.6%
Other values (16)123957
29.9%

arrival_date_week_number
Real number (ℝ≥0)

HIGH CORRELATION

Distinct53
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean31.33977219
Minimum1
Maximum53
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size509.7 KiB

Quantile statistics

Minimum1
5-th percentile8
Q121
median34
Q342
95-th percentile50
Maximum53
Range52
Interquartile range (IQR)21

Descriptive statistics

Standard deviation13.46402432
Coefficient of variation (CV)0.4296146199
Kurtosis-0.9127391335
Mean31.33977219
Median Absolute Deviation (MAD)10
Skewness-0.3665873391
Sum2044262
Variance181.2799508
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
422360
 
3.6%
382318
 
3.6%
412238
 
3.4%
392166
 
3.3%
401984
 
3.0%
431883
 
2.9%
371869
 
2.9%
441819
 
2.8%
331748
 
2.7%
361734
 
2.7%
Other values (43)45110
69.2%
ValueCountFrequency (%)
1202
 
0.3%
2335
 
0.5%
3359
0.6%
4485
0.7%
5485
0.7%
6575
0.9%
7805
1.2%
8893
1.4%
9847
1.3%
10890
1.4%
ValueCountFrequency (%)
531502
2.3%
52928
1.4%
51753
1.2%
501256
1.9%
491454
2.2%
481297
2.0%
471419
2.2%
461324
2.0%
451642
2.5%
441819
2.8%

arrival_date_day_of_month
Real number (ℝ≥0)

Distinct31
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean15.84781309
Minimum1
Maximum31
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size509.7 KiB

Quantile statistics

Minimum1
5-th percentile2
Q18
median16
Q323
95-th percentile30
Maximum31
Range30
Interquartile range (IQR)15

Descriptive statistics

Standard deviation8.748182441
Coefficient of variation (CV)0.5520119648
Kurtosis-1.185115979
Mean15.84781309
Median Absolute Deviation (MAD)8
Skewness0.0009047732918
Sum1033737
Variance76.53069603
MonotonicityNot monotonic
Histogram with fixed size bins (bins=31)
ValueCountFrequency (%)
172565
 
3.9%
52502
 
3.8%
122298
 
3.5%
162297
 
3.5%
182296
 
3.5%
192289
 
3.5%
302272
 
3.5%
242248
 
3.4%
262247
 
3.4%
202244
 
3.4%
Other values (21)41971
64.3%
ValueCountFrequency (%)
11744
2.7%
22157
3.3%
32040
3.1%
42097
3.2%
52502
3.8%
62074
3.2%
72164
3.3%
82159
3.3%
92220
3.4%
101846
2.8%
ValueCountFrequency (%)
311152
1.8%
302272
3.5%
291940
3.0%
282152
3.3%
272063
3.2%
262247
3.4%
252204
3.4%
242248
3.4%
231811
2.8%
221949
3.0%

stays_in_weekend_nights
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size509.7 KiB
0
29738 
1
17721 
2
17357 
3
 
261
4
 
152

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters: 65229
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
029738
45.6%
117721
27.2%
217357
26.6%
3261
 
0.4%
4152
 
0.2%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
029738
45.6%
117721
27.2%
217357
26.6%
3261
 
0.4%
4152
 
0.2%

Most occurring characters

ValueCountFrequency (%)
029738
45.6%
117721
27.2%
217357
26.6%
3261
 
0.4%
4152
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number65229
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
029738
45.6%
117721
27.2%
217357
26.6%
3261
 
0.4%
4152
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Common65229
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
029738
45.6%
117721
27.2%
217357
26.6%
3261
 
0.4%
4152
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII65229
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
029738
45.6%
117721
27.2%
217357
26.6%
3261
 
0.4%
4152
 
0.2%

stays_in_week_nights
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct7
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.224792654
Minimum0
Maximum6
Zeros4007
Zeros (%)6.1%
Negative0
Negative (%)0.0%
Memory size509.7 KiB

Quantile statistics

Minimum0
5-th percentile0
Q11
median2
Q33
95-th percentile5
Maximum6
Range6
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.354992231
Coefficient of variation (CV)0.6090420284
Kurtosis-0.1484423279
Mean2.224792654
Median Absolute Deviation (MAD)1
Skewness0.6228936977
Sum145121
Variance1.836003945
MonotonicityNot monotonic
Histogram with fixed size bins (bins=7)
Value  Count  Frequency (%)
2      20137  30.9%
1      17850  27.4%
3      12341  18.9%
4      5131   7.9%
5      5128   7.9%
0      4007   6.1%
6      635    1.0%

Value  Count  Frequency (%)
0      4007   6.1%
1      17850  27.4%
2      20137  30.9%
3      12341  18.9%
4      5131   7.9%
5      5128   7.9%
6      635    1.0%

Value  Count  Frequency (%)
6      635    1.0%
5      5128   7.9%
4      5131   7.9%
3      12341  18.9%
2      20137  30.9%
1      17850  27.4%
0      4007   6.1%
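
Because stays_in_week_nights takes only seven distinct values, its summary statistics can be recomputed directly from the value counts. The sketch below does this in plain Python; the variable names and the helper `median_from_freq` are illustrative assumptions. Dividing by n - 1 (sample variance) is what reproduces the variance reported for this variable.

```python
from math import sqrt

# Value -> count, copied from the stays_in_week_nights tables above.
freq = {0: 4007, 1: 17850, 2: 20137, 3: 12341, 4: 5131, 5: 5128, 6: 635}

n = sum(freq.values())
total = sum(value * count for value, count in freq.items())
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
squared_dev = sum(count * (value - mean) ** 2 for value, count in freq.items())
variance = squared_dev / (n - 1)
std = sqrt(variance)

def median_from_freq(table):
    """Return the middle value of the sorted data (upper middle for even n)."""
    half = sum(table.values()) // 2
    running = 0
    for value in sorted(table):
        running += table[value]
        if running > half:
            return value

print(n, total, round(mean, 6), round(variance, 6), median_from_freq(freq))
```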

adults
Categorical

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size509.7 KiB
2.0
48980 
1.0
13212 
3.0
 
2858
0.0
 
166
4.0
 
13

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters: 195687
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row1.0
2nd row2.0
3rd row2.0
4th row2.0
5th row2.0

Common Values

ValueCountFrequency (%)
2.048980
75.1%
1.013212
 
20.3%
3.02858
 
4.4%
0.0166
 
0.3%
4.013
 
< 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
2.048980
75.1%
1.013212
 
20.3%
3.02858
 
4.4%
0.0166
 
0.3%
4.013
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
065395
33.4%
.65229
33.3%
248980
25.0%
113212
 
6.8%
32858
 
1.5%
413
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number130458
66.7%
Other Punctuation65229
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
065395
50.1%
248980
37.5%
113212
 
10.1%
32858
 
2.2%
413
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
.65229
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common195687
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
065395
33.4%
.65229
33.3%
248980
25.0%
113212
 
6.8%
32858
 
1.5%
413
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII195687
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
065395
33.4%
.65229
33.3%
248980
25.0%
113212
 
6.8%
32858
 
1.5%
413
 
< 0.1%

children
Categorical

HIGH CORRELATION

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size509.7 KiB
0.0
61712 
1.0
 
2165
2.0
 
1336
3.0
 
16

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters: 195687
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.061712
94.6%
1.02165
 
3.3%
2.01336
 
2.0%
3.016
 
< 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.061712
94.6%
1.02165
 
3.3%
2.01336
 
2.0%
3.016
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0126941
64.9%
.65229
33.3%
12165
 
1.1%
21336
 
0.7%
316
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number130458
66.7%
Other Punctuation65229
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0126941
97.3%
12165
 
1.7%
21336
 
1.0%
316
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
.65229
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common195687
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0126941
64.9%
.65229
33.3%
12165
 
1.1%
21336
 
0.7%
316
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII195687
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0126941
64.9%
.65229
33.3%
12165
 
1.1%
21336
 
0.7%
316
 
< 0.1%

babies
Categorical

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size509.7 KiB
0.0
64777 
1.0
 
447
2.0
 
3
10.0
 
1
9.0
 
1

Length

Max length4
Median length3
Mean length3.000015331
Min length3

Characters and Unicode

Total characters: 195688
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.064777
99.3%
1.0447
 
0.7%
2.03
 
< 0.1%
10.01
 
< 0.1%
9.01
 
< 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.064777
99.3%
1.0447
 
0.7%
2.03
 
< 0.1%
10.01
 
< 0.1%
9.01
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0130007
66.4%
.65229
33.3%
1448
 
0.2%
23
 
< 0.1%
91
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number130459
66.7%
Other Punctuation65229
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0130007
99.7%
1448
 
0.3%
23
 
< 0.1%
91
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
.65229
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common195688
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0130007
66.4%
.65229
33.3%
1448
 
0.2%
23
 
< 0.1%
91
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII195688
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0130007
66.4%
.65229
33.3%
1448
 
0.2%
23
 
< 0.1%
91
 
< 0.1%

meal
Categorical

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size509.7 KiB
BB
51697 
HB
7292 
SC
5274 
SC
 
515
FB
 
451

Length

Max length9
Median length9
Mean length8.944733171
Min length2

Characters and Unicode

Total characters: 583456
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowBB
2nd rowBB
3rd rowBB
4th rowFB
5th rowBB

Common Values

Value        Count  Frequency (%)
"BB       "  51697  79.3%
"HB       "  7292   11.2%
"SC       "  5274   8.1%
"SC"         515    0.8%
"FB       "  451    0.7%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
bb     51697  79.3%
hb     7292   11.2%
sc     5789   8.9%
fb     451    0.7%
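
The meal variable lists SC twice because most labels are padded with whitespace: the maximum length is 9, the minimum is 2, and space characters dominate the character counts. The sketch below shows how stripping the padding merges the two SC rows into the 5789 shown in the table above. It assumes the padding is seven trailing spaces per padded label, which is consistent with the reported lengths and space count but is not stated outright in the report; `merge_stripped` is a hypothetical helper.

```python
from collections import Counter

PAD = " " * 7  # assumption: padded labels carry seven trailing spaces (length 9)

# Raw label -> count, following the meal Common Values table.
raw_counts = {
    "BB" + PAD: 51697,
    "HB" + PAD: 7292,
    "SC" + PAD: 5274,
    "SC": 515,
    "FB" + PAD: 451,
}

def merge_stripped(counts):
    """Strip surrounding whitespace from each label and add up the counts."""
    merged = Counter()
    for label, count in counts.items():
        merged[label.strip()] += count
    return merged

clean = merge_stripped(raw_counts)
# Total number of space characters implied by the padding assumption.
spaces = sum(label.count(" ") * count for label, count in raw_counts.items())
print(dict(clean), spaces)
```

Under this assumption the space total comes out equal to the Space Separator count reported for meal, which supports reading the two SC rows as the same category.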

Most occurring characters

Value    Count   Frequency (%)
(space)  452998  77.6%
B        111137  19.0%
H        7292    1.2%
S        5789    1.0%
C        5789    1.0%
F        451     0.1%

Most occurring categories

ValueCountFrequency (%)
Space Separator452998
77.6%
Uppercase Letter130458
 
22.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
B111137
85.2%
H7292
 
5.6%
S5789
 
4.4%
C5789
 
4.4%
F451
 
0.3%
Space Separator
ValueCountFrequency (%)
452998
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common452998
77.6%
Latin130458
 
22.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
B111137
85.2%
H7292
 
5.6%
S5789
 
4.4%
C5789
 
4.4%
F451
 
0.3%
Common
ValueCountFrequency (%)
452998
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII583456
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
452998
77.6%
B111137
 
19.0%
H7292
 
1.2%
S5789
 
1.0%
C5789
 
1.0%
F451
 
0.1%

country
Categorical

HIGH CARDINALITY

Distinct155
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size509.7 KiB
PRT
28831 
FRA
5974 
GBR
5109 
ESP
4900 
DEU
3887 
Other values (150)
16528 

Length

Max length3
Median length3
Mean length2.992518665
Min length2

Characters and Unicode

Total characters: 195199
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 35
Unique (%): 0.1%

Sample

1st rowGBR
2nd rowGBR
3rd rowPRT
4th rowPRT
5th rowPRT

Common Values

Value               Count  Frequency (%)
PRT                 28831  44.2%
FRA                 5974   9.2%
GBR                 5109   7.8%
ESP                 4900   7.5%
DEU                 3887   6.0%
ITA                 2327   3.6%
IRL                 1445   2.2%
BEL                 1245   1.9%
NLD                 1166   1.8%
BRA                 1088   1.7%
Other values (145)  9257   14.2%
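
One common way to handle a high-cardinality variable such as country is to keep the most frequent labels and pool the rest. The sketch below applies that idea to the counts in the table above; it is an illustration of the technique rather than anything the report itself performs, and `lump_rare` and the `"Other"` label are assumptions.

```python
# Top-10 country counts copied from the Common Values table above.
top_counts = {
    "PRT": 28831, "FRA": 5974, "GBR": 5109, "ESP": 4900, "DEU": 3887,
    "ITA": 2327, "IRL": 1445, "BEL": 1245, "NLD": 1166, "BRA": 1088,
}
other_count = 9257  # the 145 remaining country codes

def lump_rare(counts, keep, other_label="Other"):
    """Keep the `keep` most frequent labels and pool the rest under one label."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    lumped = dict(ranked[:keep])
    pooled = sum(count for _, count in ranked[keep:])
    if pooled:
        lumped[other_label] = pooled
    return lumped

# Keep the five largest countries; everything else becomes "Other".
lumped = lump_rare(top_counts, keep=5)
lumped["Other"] += other_count
print(lumped)
```

With five labels kept, the pooled count equals the 16528 shown as "Other values (150)" in the country overview.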

Length

Histogram of lengths of the category
ValueCountFrequency (%)
prt28831
44.2%
fra5974
 
9.2%
gbr5109
 
7.8%
esp4900
 
7.5%
deu3887
 
6.0%
ita2327
 
3.6%
irl1445
 
2.2%
bel1245
 
1.9%
nld1166
 
1.8%
bra1088
 
1.7%
Other values (145)9257
 
14.2%

Most occurring characters

ValueCountFrequency (%)
R44590
22.8%
P34490
17.7%
T32278
16.5%
A11965
 
6.1%
E11783
 
6.0%
B7679
 
3.9%
S7609
 
3.9%
U6917
 
3.5%
F6253
 
3.2%
G5618
 
2.9%
Other values (16)26017
13.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter195199
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R44590
22.8%
P34490
17.7%
T32278
16.5%
A11965
 
6.1%
E11783
 
6.0%
B7679
 
3.9%
S7609
 
3.9%
U6917
 
3.5%
F6253
 
3.2%
G5618
 
2.9%
Other values (16)26017
13.3%

Most occurring scripts

ValueCountFrequency (%)
Latin195199
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
R44590
22.8%
P34490
17.7%
T32278
16.5%
A11965
 
6.1%
E11783
 
6.0%
B7679
 
3.9%
S7609
 
3.9%
U6917
 
3.5%
F6253
 
3.2%
G5618
 
2.9%
Other values (16)26017
13.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII195199
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R44590
22.8%
P34490
17.7%
T32278
16.5%
A11965
 
6.1%
E11783
 
6.0%
B7679
 
3.9%
S7609
 
3.9%
U6917
 
3.5%
F6253
 
3.2%
G5618
 
2.9%
Other values (16)26017
13.3%
Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size509.7 KiB
TA/TO
54454 
Direct
6853 
Corporate
 
3823
GDS
 
99

Length

Max length9
Median length5
Mean length5.336460777
Min length3

Characters and Unicode

Total characters: 348092
Distinct characters: 16
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowDirect
2nd rowTA/TO
3rd rowDirect
4th rowDirect
5th rowTA/TO

Common Values

ValueCountFrequency (%)
TA/TO54454
83.5%
Direct6853
 
10.5%
Corporate3823
 
5.9%
GDS99
 
0.2%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
ta/to54454
83.5%
direct6853
 
10.5%
corporate3823
 
5.9%
gds99
 
0.2%

Most occurring characters

ValueCountFrequency (%)
T108908
31.3%
A54454
15.6%
/54454
15.6%
O54454
15.6%
r14499
 
4.2%
e10676
 
3.1%
t10676
 
3.1%
o7646
 
2.2%
D6952
 
2.0%
i6853
 
2.0%
Other values (6)18520
 
5.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter228789
65.7%
Lowercase Letter64849
 
18.6%
Other Punctuation54454
 
15.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r14499
22.4%
e10676
16.5%
t10676
16.5%
o7646
11.8%
i6853
10.6%
c6853
10.6%
p3823
 
5.9%
a3823
 
5.9%
Uppercase Letter
ValueCountFrequency (%)
T108908
47.6%
A54454
23.8%
O54454
23.8%
D6952
 
3.0%
C3823
 
1.7%
G99
 
< 0.1%
S99
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
/54454
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin293638
84.4%
Common54454
 
15.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
T108908
37.1%
A54454
18.5%
O54454
18.5%
r14499
 
4.9%
e10676
 
3.6%
t10676
 
3.6%
o7646
 
2.6%
D6952
 
2.4%
i6853
 
2.3%
c6853
 
2.3%
Other values (5)11667
 
4.0%
Common
ValueCountFrequency (%)
/54454
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII348092
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
T108908
31.3%
A54454
15.6%
/54454
15.6%
O54454
15.6%
r14499
 
4.2%
e10676
 
3.1%
t10676
 
3.1%
o7646
 
2.2%
D6952
 
2.0%
i6853
 
2.0%
Other values (6)18520
 
5.3%

is_repeated_guest
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size509.7 KiB
0
63458 
1
 
1771

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters: 65229
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
063458
97.3%
11771
 
2.7%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
063458
97.3%
11771
 
2.7%

Most occurring characters

ValueCountFrequency (%)
063458
97.3%
11771
 
2.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number65229
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
063458
97.3%
11771
 
2.7%

Most occurring scripts

ValueCountFrequency (%)
Common65229
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
063458
97.3%
11771
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII65229
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
063458
97.3%
11771
 
2.7%

previous_cancellations
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct15
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.1289150531
Minimum0
Maximum26
Zeros59591
Zeros (%)91.4%
Negative0
Negative (%)0.0%
Memory size509.7 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile1
Maximum26
Range26
Interquartile range (IQR)0

Descriptive statistics

Standard deviation0.9653256977
Coefficient of variation (CV)7.488075864
Kurtosis491.7489354
Mean0.1289150531
Median Absolute Deviation (MAD)0
Skewness20.72991635
Sum8409
Variance0.9318537027
MonotonicityNot monotonic
Histogram with fixed size bins (bins=15)
Value             Count  Frequency (%)
0                 59591  91.4%
1                 5367   8.2%
2                 54     0.1%
3                 51     0.1%
11                35     0.1%
24                28     < 0.1%
25                19     < 0.1%
26                18     < 0.1%
19                17     < 0.1%
5                 13     < 0.1%
Other values (5)  36     0.1%

Value  Count  Frequency (%)
0      59591  91.4%
1      5367   8.2%
2      54     0.1%
3      51     0.1%
4      5      < 0.1%
5      13     < 0.1%
6      7      < 0.1%
11     35     0.1%
13     12     < 0.1%
14     11     < 0.1%

Value  Count  Frequency (%)
26     18     < 0.1%
25     19     < 0.1%
24     28     < 0.1%
21     1      < 0.1%
19     17     < 0.1%
14     11     < 0.1%
13     12     < 0.1%
11     35     0.1%
6      7      < 0.1%
5      13     < 0.1%

previous_bookings_not_canceled
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED
ZEROS

Distinct: 58
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1077128271
Minimum: 0
Maximum: 58
Zeros: 63686
Zeros (%): 97.6%
Negative: 0
Negative (%): 0.0%
Memory size: 509.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 58
Range: 58
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.326637954
Coefficient of variation (CV): 12.31643426
Kurtosis: 758.3305593
Mean: 0.1077128271
Median Absolute Deviation (MAD): 0
Skewness: 24.30889795
Sum: 7026
Variance: 1.75996826
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
0                  63686  97.6%
1                  645    1.0%
2                  252    0.4%
3                  136    0.2%
4                  114    0.2%
5                  89     0.1%
6                  56     0.1%
7                  34     0.1%
8                  25     < 0.1%
9                  23     < 0.1%
Other values (48)  169    0.3%

Value  Count  Frequency (%)
0      63686  97.6%
1      645    1.0%
2      252    0.4%
3      136    0.2%
4      114    0.2%
5      89     0.1%
6      56     0.1%
7      34     0.1%
8      25     < 0.1%
9      23     < 0.1%

Value  Count  Frequency (%)
58     1      < 0.1%
57     1      < 0.1%
56     1      < 0.1%
55     1      < 0.1%
54     1      < 0.1%
53     1      < 0.1%
52     1      < 0.1%
51     1      < 0.1%
50     1      < 0.1%
49     1      < 0.1%

reserved_room_type
Categorical

HIGH CORRELATION

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 509.7 KiB

Value             Count
A                 50501
D                 9387
E                 2448
F                 1213
B                 860
Other values (2)  820

Length

Max length: 16
Median length: 16
Mean length: 16
Min length: 16

Characters and Unicode

Total characters: 1043664
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: A
2nd row: A
3rd row: C
4th row: C
5th row: A

Common Values

Value  Count  Frequency (%)
A      50501  77.4%
D      9387   14.4%
E      2448   3.8%
F      1213   1.9%
B      860    1.3%
G      592    0.9%
C      228    0.3%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
a      50501  77.4%
d      9387   14.4%
e      2448   3.8%
f      1213   1.9%
b      860    1.3%
g      592    0.9%
c      228    0.3%

Most occurring characters

Value    Count   Frequency (%)
(space)  978435  93.8%
A        50501   4.8%
D        9387    0.9%
E        2448    0.2%
F        1213    0.1%
B        860     0.1%
G        592     0.1%
C        228     < 0.1%

Most occurring categories

Value             Count   Frequency (%)
Space Separator   978435  93.8%
Uppercase Letter  65229   6.2%

Most frequent character per category

Uppercase Letter
Value  Count  Frequency (%)
A      50501  77.4%
D      9387   14.4%
E      2448   3.8%
F      1213   1.9%
B      860    1.3%
G      592    0.9%
C      228    0.3%

Space Separator
Value    Count   Frequency (%)
(space)  978435  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  978435  93.8%
Latin   65229   6.2%

Most frequent character per script

Latin
Value  Count  Frequency (%)
A      50501  77.4%
D      9387   14.4%
E      2448   3.8%
F      1213   1.9%
B      860    1.3%
G      592    0.9%
C      228    0.3%

Common
Value    Count   Frequency (%)
(space)  978435  100.0%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  1043664  100.0%

Most frequent character per block

ASCII
Value    Count   Frequency (%)
(space)  978435  93.8%
A        50501   4.8%
D        9387    0.9%
E        2448    0.2%
F        1213    0.1%
B        860     0.1%
G        592     0.1%
C        228     < 0.1%

booking_changes
Real number (ℝ≥0)

ZEROS

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.19426942
Minimum: 0
Maximum: 17
Zeros: 56195
Zeros (%): 86.2%
Negative: 0
Negative (%): 0.0%
Memory size: 509.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 17
Range: 17
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.593837782
Coefficient of variation (CV): 3.056774359
Kurtosis: 77.63499906
Mean: 0.19426942
Median Absolute Deviation (MAD): 0
Skewness: 6.021001643
Sum: 12672
Variance: 0.3526433113
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=16)
Value             Count  Frequency (%)
0                 56195  86.2%
1                 6615   10.1%
2                 1754   2.7%
3                 413    0.6%
4                 149    0.2%
5                 42     0.1%
6                 19     < 0.1%
7                 18     < 0.1%
8                 7      < 0.1%
9                 5      < 0.1%
Other values (6)  12     < 0.1%

Value  Count  Frequency (%)
0      56195  86.2%
1      6615   10.1%
2      1754   2.7%
3      413    0.6%
4      149    0.2%
5      42     0.1%
6      19     < 0.1%
7      18     < 0.1%
8      7      < 0.1%
9      5      < 0.1%

Value  Count  Frequency (%)
17     1      < 0.1%
16     1      < 0.1%
15     2      < 0.1%
14     2      < 0.1%
13     4      < 0.1%
10     2      < 0.1%
9      5      < 0.1%
8      7      < 0.1%
7      18     < 0.1%
6      19     < 0.1%

days_in_waiting_list
Real number (ℝ≥0)

ZEROS

Distinct: 97
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.334421806
Minimum: 0
Maximum: 259
Zeros: 62005
Zeros (%): 95.1%
Negative: 0
Negative (%): 0.0%
Memory size: 509.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 259
Range: 259
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 18.23960574
Coefficient of variation (CV): 5.470095508
Kurtosis: 65.57762721
Mean: 3.334421806
Median Absolute Deviation (MAD): 0
Skewness: 7.397372748
Sum: 217501
Variance: 332.6832177
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
0                  62005  95.1%
39                 185    0.3%
58                 164    0.3%
44                 137    0.2%
31                 126    0.2%
35                 95     0.1%
69                 89     0.1%
46                 87     0.1%
87                 80     0.1%
63                 80     0.1%
Other values (87)  2181   3.3%

Value  Count  Frequency (%)
0      62005  95.1%
1      3      < 0.1%
2      2      < 0.1%
3      59     0.1%
4      20     < 0.1%
5      2      < 0.1%
6      11     < 0.1%
8      4      < 0.1%
9      13     < 0.1%
10     27     < 0.1%

Value  Count  Frequency (%)
259    10     < 0.1%
236    35     0.1%
224    10     < 0.1%
215    21     < 0.1%
207    15     < 0.1%
193    1      < 0.1%
187    45     0.1%
178    30     < 0.1%
176    50     0.1%
174    19     < 0.1%

customer_type
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 509.7 KiB

Value            Count
Transient        45493
Transient-Party  16703
Contract         2746
Group            287

Length

Max length: 15
Median length: 9
Mean length: 10.47670515
Min length: 5

Characters and Unicode

Total characters: 683385
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Transient
2nd row: Transient
3rd row: Transient
4th row: Transient
5th row: Transient

Common Values

Value            Count  Frequency (%)
Transient        45493  69.7%
Transient-Party  16703  25.6%
Contract         2746   4.2%
Group            287    0.4%

Length

Histogram of lengths of the category

Category Frequency Plot

Value            Count  Frequency (%)
transient        45493  69.7%
transient-party  16703  25.6%
contract         2746   4.2%
group            287    0.4%

Most occurring characters

Value             Count   Frequency (%)
n                 127138  18.6%
t                 84391   12.3%
r                 81932   12.0%
a                 81645   11.9%
T                 62196   9.1%
s                 62196   9.1%
i                 62196   9.1%
e                 62196   9.1%
y                 16703   2.4%
-                 16703   2.4%
Other values (7)  26089   3.8%

Most occurring categories

Value             Count   Frequency (%)
Lowercase Letter  584750  85.6%
Uppercase Letter  81932   12.0%
Dash Punctuation  16703   2.4%

Most frequent character per category

Lowercase Letter
Value             Count   Frequency (%)
n                 127138  21.7%
t                 84391   14.4%
r                 81932   14.0%
a                 81645   14.0%
s                 62196   10.6%
i                 62196   10.6%
e                 62196   10.6%
y                 16703   2.9%
o                 3033    0.5%
c                 2746    0.5%
Other values (2)  574     0.1%

Uppercase Letter
Value  Count  Frequency (%)
T      62196  75.9%
P      16703  20.4%
C      2746   3.4%
G      287    0.4%

Dash Punctuation
Value  Count  Frequency (%)
-      16703  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Latin   666682  97.6%
Common  16703   2.4%

Most frequent character per script

Latin
Value             Count   Frequency (%)
n                 127138  19.1%
t                 84391   12.7%
r                 81932   12.3%
a                 81645   12.2%
T                 62196   9.3%
s                 62196   9.3%
i                 62196   9.3%
e                 62196   9.3%
y                 16703   2.5%
P                 16703   2.5%
Other values (6)  9386    1.4%

Common
Value  Count  Frequency (%)
-      16703  100.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  683385  100.0%

Most frequent character per block

ASCII
Value             Count   Frequency (%)
n                 127138  18.6%
t                 84391   12.3%
r                 81932   12.0%
a                 81645   11.9%
T                 62196   9.1%
s                 62196   9.1%
i                 62196   9.1%
e                 62196   9.1%
y                 16703   2.4%
-                 16703   2.4%
Other values (7)  26089   3.8%

required_car_parking_spaces
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 509.7 KiB

Value  Count
0      61547
1      3670
2      11
3      1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 65229
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      61547  94.4%
1      3670   5.6%
2      11     < 0.1%
3      1      < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0      61547  94.4%
1      3670   5.6%
2      11     < 0.1%
3      1      < 0.1%

Most occurring characters

Value  Count  Frequency (%)
0      61547  94.4%
1      3670   5.6%
2      11     < 0.1%
3      1      < 0.1%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  65229  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      61547  94.4%
1      3670   5.6%
2      11     < 0.1%
3      1      < 0.1%

Most occurring scripts

Value   Count  Frequency (%)
Common  65229  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      61547  94.4%
1      3670   5.6%
2      11     < 0.1%
3      1      < 0.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  65229  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      61547  94.4%
1      3670   5.6%
2      11     < 0.1%
3      1      < 0.1%

total_of_special_requests
Real number (ℝ≥0)

ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5127627282
Minimum: 0
Maximum: 5
Zeros: 40671
Zeros (%): 62.4%
Negative: 0
Negative (%): 0.0%
Memory size: 509.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 2
Maximum: 5
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7525902719
Coefficient of variation (CV): 1.467716412
Kurtosis: 1.589829586
Mean: 0.5127627282
Median Absolute Deviation (MAD): 0
Skewness: 1.41455973
Sum: 33447
Variance: 0.5663921173
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)
Value  Count  Frequency (%)
0      40671  62.4%
1      16958  26.0%
2      6444   9.9%
3      1036   1.6%
4      107    0.2%
5      13     < 0.1%

Value  Count  Frequency (%)
0      40671  62.4%
1      16958  26.0%
2      6444   9.9%
3      1036   1.6%
4      107    0.2%
5      13     < 0.1%

Value  Count  Frequency (%)
5      13     < 0.1%
4      107    0.2%
3      1036   1.6%
2      6444   9.9%
1      16958  26.0%
0      40671  62.4%

total_nights
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.049977771
Minimum: 1
Maximum: 10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 509.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 3
Q3: 4
95-th percentile: 7
Maximum: 10
Range: 9
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.738108078
Coefficient of variation (CV): 0.5698756545
Kurtosis: 0.6805428651
Mean: 3.049977771
Median Absolute Deviation (MAD): 1
Skewness: 1.005460633
Sum: 198947
Variance: 3.021019692
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=10)
Value  Count  Frequency (%)
2      16795  25.7%
3      15652  24.0%
1      12122  18.6%
4      9629   14.8%
5      4288   6.6%
7      3853   5.9%
6      2078   3.2%
8      542    0.8%
9      152    0.2%
10     118    0.2%

Value  Count  Frequency (%)
1      12122  18.6%
2      16795  25.7%
3      15652  24.0%
4      9629   14.8%
5      4288   6.6%
6      2078   3.2%
7      3853   5.9%
8      542    0.8%
9      152    0.2%
10     118    0.2%

Value  Count  Frequency (%)
10     118    0.2%
9      152    0.2%
8      542    0.8%
7      3853   5.9%
6      2078   3.2%
5      4288   6.6%
4      9629   14.8%
3      15652  24.0%
2      16795  25.7%
1      12122  18.6%

Interactions

[Figure: 121 pairwise interaction plots (Matplotlib), not reproduced]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic correlations. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
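
The rank-then-correlate definition above can be sketched in plain Python. This is a minimal illustration, not the code the report generator runs: the names `average_ranks` and `spearman_rho` and the sample values are hypothetical, and giving tied values their average rank is an assumed convention that the text does not specify.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Covariance of the rank variables divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry) / n)
    if std_x == 0 or std_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (std_x * std_y)


# Hypothetical data: y = x**3 is monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
```

Because the ranks of `x` and `y` coincide in this example, `rho` should equal 1 up to floating-point rounding, even though the relationship is not linear.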

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
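
As a minimal sketch of that calculation, the following plain-Python function divides the covariance by the product of the standard deviations. The name `pearson_r` and the sample values are hypothetical, chosen only to illustrate the formula and the location/scale invariance mentioned above.

```python
import math


def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if std_x == 0 or std_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / (std_x * std_y)


# Hypothetical measurements with a roughly linear relationship.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
r_original = pearson_r(x, y)

# Shift and rescale each variable separately (positive scale factors).
x_shifted = [2 * v + 10 for v in x]
y_shifted = [0.5 * v - 3 for v in y]
r_transformed = pearson_r(x_shifted, y_shifted)
```

Here `r_original` and `r_transformed` should agree up to floating-point rounding, which is the invariance property in practice. The same 1/n normalisation is used for the covariance and both standard deviations, so the factor cancels in the ratio.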

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
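
The pair-counting definition above can be sketched directly. The function name `kendall_tau_a` and the sample values are hypothetical. The sketch follows the formula as stated in the text, where tied pairs count as neither concordant nor discordant (often called tau-a); library implementations may default to a tie-adjusted variant and so give different values on data with many ties.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    concordant = 0
    discordant = 0
    total_pairs = 0
    for i, j in combinations(range(len(xs)), 2):
        total_pairs += 1
        # The sign of the product tells whether both variables move the same way.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 means a tie in x or y: counted in neither group.
    return (concordant - discordant) / total_pairs


# Hypothetical example: three observations with one swapped pair.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
```

In this example two of the three pairs are concordant and one is discordant, so `tau` is (2 - 1) / 3. The double loop is quadratic in the number of observations, which is fine for illustration but slow for a dataset of this size.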

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
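
A minimal sketch of a bias-corrected Cramér's V in plain Python follows. It uses the correction as it is commonly stated (shrinking φ² and the table dimensions by terms in 1/(n - 1)); whether this matches the report generator's code line for line is an assumption. The name `cramers_v_corrected` and the sample labels are hypothetical.

```python
import math
from collections import Counter


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equal-length sequences of category labels."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_count in row_totals.items():
        for b, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Assumed form of the correction: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0  # a variable with a single category carries no association
    return math.sqrt(phi2_corr / denom)


# Hypothetical labels: perfectly associated versus independent.
v_perfect = cramers_v_corrected(["A", "B"] * 50, ["x", "y"] * 50)
v_independent = cramers_v_corrected(["A", "A", "B", "B"] * 25, ["x", "y", "x", "y"] * 25)
```

With these inputs `v_perfect` should be 1 up to rounding, and `v_independent` is clamped to 0 because the uncorrected φ² is already 0.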

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation for φk is available.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

,id,is_canceled,lead_time,arrival_date_year,arrival_date_month,arrival_date_week_number,arrival_date_day_of_month,stays_in_weekend_nights,stays_in_week_nights,adults,children,babies,meal,country,distribution_channel,is_repeated_guest,previous_cancellations,previous_bookings_not_canceled,reserved_room_type,booking_changes,days_in_waiting_list,customer_type,required_car_parking_spaces,total_of_special_requests,total_nights
0,0,0,7.0,2015,July,27,1,0,1,1.0,0.0,0.0,BB,GBR,Direct,0,0,0,A,0,0,Transient,0,0,1
1,1,0,14.0,2015,July,27,1,0,2,2.0,0.0,0.0,BB,GBR,TA/TO,0,0,0,A,0,0,Transient,0,1,2
2,2,0,0.0,2015,July,27,1,0,2,2.0,0.0,0.0,BB,PRT,Direct,0,0,0,C,0,0,Transient,0,0,2
3,3,0,9.0,2015,July,27,1,0,2,2.0,0.0,0.0,FB,PRT,Direct,0,0,0,C,0,0,Transient,0,1,2
4,4,1,85.0,2015,July,27,1,0,3,2.0,0.0,0.0,BB,PRT,TA/TO,0,0,0,A,0,0,Transient,0,1,3
5,5,1,75.0,2015,July,27,1,0,3,2.0,0.0,0.0,HB,PRT,TA/TO,0,0,0,D,0,0,Transient,0,0,3
6,6,1,23.0,2015,July,27,1,0,4,2.0,0.0,0.0,BB,PRT,TA/TO,0,0,0,E,0,0,Transient,0,0,4
7,7,0,18.0,2015,July,27,1,0,4,2.0,1.0,0.0,HB,ESP,TA/TO,0,0,0,G,1,0,Transient,0,1,4
8,8,0,68.0,2015,July,27,1,0,4,2.0,0.0,0.0,BB,IRL,TA/TO,0,0,0,D,0,0,Transient,0,3,4
9,9,0,37.0,2015,July,27,1,0,4,2.0,0.0,0.0,BB,PRT,TA/TO,0,0,0,E,0,0,Contract,0,0,4

Last rows

,id,is_canceled,lead_time,arrival_date_year,arrival_date_month,arrival_date_week_number,arrival_date_day_of_month,stays_in_weekend_nights,stays_in_week_nights,adults,children,babies,meal,country,distribution_channel,is_repeated_guest,previous_cancellations,previous_bookings_not_canceled,reserved_room_type,booking_changes,days_in_waiting_list,customer_type,required_car_parking_spaces,total_of_special_requests,total_nights
65219,84016,0,42.0,2016,December,53,31,2,2,2.0,1.0,0.0,BB,ESP,TA/TO,0,0,0,A,1,0,Transient,0,1,4
65220,84022,0,23.0,2016,December,53,30,2,3,2.0,0.0,1.0,BB,HUN,TA/TO,0,0,0,D,1,0,Transient,0,3,5
65221,84023,0,42.0,2016,December,53,29,2,5,2.0,0.0,0.0,BB,USA,Direct,0,0,0,E,0,0,Transient,0,0,7
65222,84050,0,96.0,2016,December,53,31,2,3,2.0,0.0,0.0,HB,CHE,TA/TO,0,0,0,A,0,0,Transient,0,0,5
65223,84056,0,23.0,2016,December,53,30,2,4,2.0,0.0,0.0,BB,NLD,TA/TO,0,0,0,D,0,0,Transient,0,2,6
65224,84057,0,23.0,2016,December,53,30,2,4,2.0,0.0,0.0,BB,CHN,TA/TO,0,0,0,D,0,0,Transient,0,2,6
65225,84063,0,53.0,2016,December,53,31,2,3,2.0,0.0,0.0,HB,FRA,TA/TO,0,0,0,D,0,0,Transient,0,3,5
65226,84094,0,7.0,2016,December,53,31,2,4,2.0,0.0,0.0,BB,FRA,TA/TO,0,0,0,D,0,0,Transient,0,1,6
65227,84117,0,17.0,2016,December,53,30,2,5,2.0,0.0,0.0,SC,FRA,TA/TO,0,0,0,A,0,0,Transient,0,1,7
65228,84121,0,107.0,2016,December,53,31,2,5,2.0,0.0,0.0,BB,FRA,TA/TO,0,0,0,A,0,0,Transient,0,0,7